1. Introduction. Our purpose herein is to introduce a model for quality control, and to characterize a policy which maximizes the given payoff function. Envisage a machine with two internal states, 0 and 1. Starting at state 0 at time zero, it manufactures an item which is either defective or non-defective and then, in unit time, either remains in state 0 or goes to state 1 according to the scheme

$$P(0 \to 0) = \alpha, \qquad P(0 \to 1) = 1 - \alpha, \qquad P(1 \to 1) = 1.$$

Note that 1 is an absorbing state. After every transition the machine manufactures another item. The probabilities of the item being defective or non-defective depend on the state of the machine, and are given by:

                 non-defective    defective
    state 0          $p_0$          $q_0$
    state 1          $p_1$          $q_1$

where $p_i + q_i = 1$, $p_0 > p_1$, $\alpha < 1$.

After any number of items have been turned out, the machine may be stopped for an integer time $T$, and when this repair period $T$ is over it is in state 0 and the manufacturing process begins again.

If there is profit $C$ for every non-defective item, cost $D$ for every defective item and a charge $G$ for every time unit that the machine is in repair, then, roughly, we wish to maximize the long run profit per unit time.
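The model just described is easy to simulate, which can help in building intuition before any theory is developed. The sketch below is only an illustration: the function names, the `should_stop` callback, and the cycle-length accounting (number of items plus `T - 1`) are assumptions made for this example, not something prescribed by the text above.

```python
import random


def simulate_cycle(should_stop, alpha, q0, q1, C, D, G, T, rng):
    """Simulate one run-and-repair cycle; return (profit, duration).

    should_stop(items) is a hypothetical callback: it sees only the list of
    items made so far in this cycle (True = defective) and returns True to
    stop and repair. The machine state itself is never shown to it.
    """
    state = 0                      # the machine is in state 0 after repair
    items = []
    profit = 0.0
    while True:
        defect_prob = q1 if state == 1 else q0
        defective = rng.random() < defect_prob
        items.append(defective)
        profit += -D if defective else C
        if should_stop(items):
            break
        # One unit of time passes; state 1 is absorbing.
        if state == 0 and rng.random() >= alpha:
            state = 1
    profit -= G * T                # charge for the repair period
    duration = len(items) + T - 1  # assumed cycle-length accounting
    return profit, duration


def long_run_rate(should_stop, n_cycles, seed=0, **params):
    """Total profit divided by total time over n_cycles simulated cycles."""
    rng = random.Random(seed)
    total_profit = 0.0
    total_time = 0.0
    for _ in range(n_cycles):
        r, n = simulate_cycle(should_stop, rng=rng, **params)
        total_profit += r
        total_time += n
    return total_profit / total_time
```

For instance, `long_run_rate(lambda items: len(items) >= 3, 10000, alpha=0.9, q0=0.1, q1=0.6, C=1.0, D=2.0, G=0.5, T=2)` estimates the profit rate of the naive rule "repair after every third item".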
By a policy is meant some method of deciding when to stop the machine and repair it. The difficulty, of course, is that we cannot directly observe the state of the machine, but only the condition of the manufactured items. If our stop and repair policy is too erratic, it may be that the limiting profit per unit time does not exist. We extricate ourselves from this difficulty by considering only "sensible" policies. Our main result is that, among these latter, the optimum is of the form: there is a number $\lambda^*$ such that when the conditional probability that the machine is currently in state 1, given all the observed items up to the present, exceeds $\lambda^*$, stop and repair. This policy does not seem to reduce to any of the standard quality control techniques.

This problem is similar to those treated by Howard [1], except that the pertinent Markov chain has an infinite number of states. We were led to it by our work on stopping rules, and the treatment is an interesting example of some of the techniques mentioned in [2]. In the section to follow we reduce the problem to a stopping rule problem. Following that procedure we show that the stopping rule problem has solutions of the desired form. This result is summarized in Theorem 2 of section 4.

2. Notations and Reduction of Problem. The class S of policies we will work within is defined by: a policy is in S if

i) it leads to stop and repair infinitely often with probability one;

ii) the decision as to whether to stop and repair is based only on the sequence of items turned out since the end of the previous repair period;

iii) if a sequence of items, following some repair period, leads to a decision to stop and repair, then this same sequence following any repair period leads to that decision;

iv) the expected duration of running time between repair periods is finite.

There is possible here an alternative approach which starts from a much larger class of policies and shows that for each policy in the larger class there is a policy in S that is at least as good. See Blackwell [3], for instance. But this would take us into theory far afield from the present example.

Under a policy in S, define the first cycle of the machine to be its history from time zero to the end of the first repair period, but not including the first item manufactured after repair. The second cycle begins with this latter item and extends similarly to the end of the second repair period, and so on. Let $R_k$ be the total profit during the $k$th cycle, $N_k$ the length of the $k$th cycle. Then the profit per unit time over the first $n$ cycles is

$$\frac{R_1 + \cdots + R_n}{N_1 + \cdots + N_n}.$$

Under a policy in S, $R_1, \ldots, R_n$ are independent, identically distributed, with $E|R_1| < \infty$, and similarly for the $N_1, \ldots, N_n$. The law of large numbers is thus in force, so that with probability one

$$\lim_{n} \frac{R_1 + \cdots + R_n}{N_1 + \cdots + N_n} = \frac{ER_1}{EN_1}.$$

This relationship reduces the problem to an analysis of only the first cycle, i.e., to finding a policy which maximizes $ER_1/EN_1$. For the real valued parameter $\beta$, define

$$\varphi(\beta) = \sup\,(ER_1 - \beta EN_1)$$

where the sup is over all policies in S. Note that $\varphi(\beta)$ is decreasing in $\beta$, since $ER_1 - \beta EN_1$ is decreasing in $\beta$ for every fixed policy. Also $\varphi(\beta) > -\infty$, all $\beta$, since $ER_1 - \beta EN_1 > -\infty$ for the policy: stop and repair after one item. Let $\beta_0 = \inf\{\beta;\ \varphi(\beta) < \infty\}$; then

Proposition 1: $\beta_0 < \infty$. There is a unique number $\beta^* > \beta_0$ such that $\varphi(\beta^*) = 0$, and $\beta^* \geq C - (C+D)q_1$.

Proof: Suppose $\beta > C$; then the maximum amount we can make in any period is $C$, but because we are being charged an amount $\beta$ for every period (because of the term $-\beta EN_1$) it follows that $\varphi(\beta) < 0$, $\beta > C$. In particular $\beta_0 \leq C < \infty$. For each policy in S, $ER_1 - \beta EN_1$ is linear in $\beta$, hence $\varphi(\beta)$, being a supremum of linear functions, is convex on $(\beta_0, \infty)$ and thus continuous. Since $\varphi(\beta_0) = \infty$, for any given number $M$ there is a policy such that $ER_1 - \beta_0 EN_1 > 2M$. By continuity, there is an $\epsilon > 0$ such that $ER_1 - (\beta_0 + \epsilon)EN_1 > M$, so that $\varphi(\beta_0 + \epsilon) > 0$. Since $\varphi(\beta)$ is continuous and strictly decreasing (each $ER_1 - \beta EN_1$ has slope $-EN_1 \leq -1$), there must be a unique solution $\beta^*$ of $\varphi(\beta) = 0$.

Now assume that $\beta^* < C - (C+D)q_1$ and consider a policy that continues for $n$ items, $n$ large, and then stops and repairs. Since the machine is in state 1 with probability tending to unity as more and more transitions go by, $ER_1$ is equal to $(Cp_1 - Dq_1)n$ plus terms of lower order in $n$. But $Cp_1 - Dq_1 = C - (C+D)q_1$, so that $ER_1 - \beta^* EN_1$ is equal to $[C - (C+D)q_1 - \beta^*]n$ plus terms of lower order, contradicting $\varphi(\beta^*) = 0$.

For this number $\beta^*$, we have $\sup\,(ER_1 - \beta^* EN_1) = 0$, so that $ER_1 - \beta^* EN_1 \leq 0$ for all policies in S. Hence

$$\sup \frac{ER_1}{EN_1} = \beta^*.$$

Further, if there is a policy which achieves the optimization of $ER_1 - \beta^* EN_1$, then this same policy optimizes $ER_1/EN_1$.
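The passage from the ratio $ER_1/EN_1$ to the zero of $\varphi$ can be tried out numerically on a small family of policies. The sketch below restricts attention, purely as an assumption for illustration, to the policies "make $n$ items, then repair" with $n \leq$ `n_max`; since these form only a subclass of S, the root it finds is a lower bound for the true $\beta^*$, not $\beta^*$ itself. The function names and the bracketing interval are hypothetical.

```python
def cycle_moments(n, alpha, q0, q1, C, D, G, T):
    """Exact (ER_1, EN_1) for the policy 'make n items, then repair'."""
    expected_items_profit = 0.0
    for k in range(1, n + 1):
        # Item k is made in state 0 with probability alpha**(k-1).
        p_defective = q1 - (q1 - q0) * alpha ** (k - 1)
        expected_items_profit += C - (C + D) * p_defective
    return expected_items_profit - G * T, n + T - 1


def phi(beta, n_max, **params):
    """sup of ER_1 - beta*EN_1 over the fixed-run policies n = 1..n_max."""
    best = float("-inf")
    for n in range(1, n_max + 1):
        er, en = cycle_moments(n, **params)
        best = max(best, er - beta * en)
    return best


def solve_beta_star(n_max, lo, hi, tol=1e-10, **params):
    """Bisection for the zero of the decreasing function phi on [lo, hi].

    Assumes phi(lo) > 0 >= phi(hi).
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid, n_max, **params) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Within this restricted family the root returned by `solve_beta_star` agrees with the largest ratio `er / en` over `n = 1..n_max`, which is the content of the reduction above.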

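Let $\hat{R}$ be the profit from the items manufactured under a given policy and $\hat{N}$ the number of items manufactured. Then

$$R_1 = \hat{R} - GT, \qquad N_1 = \hat{N} + (T - 1),$$

and $ER_1 - \beta^* EN_1 = E\hat{R} - \beta^* E\hat{N} - [GT + \beta^*(T - 1)]$. Since the bracketed term does not depend on the policy, it suffices to maximize $E\hat{R} - \beta^* E\hat{N}$. Define random variables $X_k$ by

$$X_k = \begin{cases} C & \text{if the } k\text{th item is non-defective} \\ -D & \text{if the } k\text{th item is defective} \end{cases}$$

and using $J$ to denote $\hat{R} - \beta^* \hat{N}$, we write

$$EJ = E\Big( \sum_{k=1}^{\hat{N}} (X_k - \beta^*) \Big).$$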

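By an interchange of summation and integration, valid since $E\hat{N} < \infty$ (see Doob [5], for instance), this becomes

$$EJ = \sum_{k=1}^{\infty} \int_{\{\hat{N} \geq k\}} (X_k - \beta^*)\, dP.$$

Define $U_k = E(X_k \mid X_{k-1}, \ldots, X_1)$; then, since the sets $\{\hat{N} \geq k\}$ depend only on $X_{k-1}, \ldots, X_1$, we rewrite again

$$EJ = \sum_{k=1}^{\infty} \int_{\{\hat{N} \geq k\}} (U_k - \beta^*)\, dP.$$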


If we put $V_k = P(X_k = -D \mid X_{k-1}, \ldots, X_1)$, then

$$U_k = C(1 - V_k) - DV_k,$$

so

$$EJ = E \sum_{k=1}^{\hat{N}} \big( -(C+D)V_k + C - \beta^* \big).$$

The situation of maximizing $EJ$ may be described as follows: for the $k$th item we receive a fee

$$f(V_k) = -(C+D)V_k + C - \beta^*$$

and are free to stop at this point or to go on one more item.

Parts of the above reduction, using cycles, to a stopping rule problem have been used before in other contexts, and the appropriate references are in [2].

3. Reduction to a Functional Equation

The next pertinent fact is:

Proposition 3: The $V_k$ form a stationary Markov chain on $[q_0, q_1]$ such that if

$$F_1(v) = \frac{v(q_1 + \alpha q_0) - \alpha q_0 q_1}{v}, \qquad F_2(v) = \frac{v(\alpha p_0 - q_1) + q_1(1 - \alpha p_0)}{1 - v},$$

then if $V_k = v$, $V_{k+1}$ is either $F_1(v)$ or $F_2(v)$ with probabilities $v$, $1 - v$.
Proof: We have that

$$V_{k+1} = P(X_{k+1} = -D \mid X_k, \ldots, X_1) = \begin{cases} P(X_{k+1} = -D \mid X_k = -D, X_{k-1}, \ldots, X_1), & \text{probability } V_k \\ P(X_{k+1} = -D \mid X_k = C, X_{k-1}, \ldots, X_1), & \text{probability } 1 - V_k. \end{cases}$$

We introduce variables $Y_k$, $U_k$ defined by: $Y_k$ is the state of the machine just prior to the manufacture of the $k$th item, and $U_k = P(Y_k = 1 \mid X_{k-1}, \ldots, X_1)$. (From here on $U_k$ denotes this probability rather than the conditional expectation of Section 2.)

Denoting $\gamma_1 = P(X_{k+1} = -D \mid X_k = -D, X_{k-1}, \ldots, X_1)$ and $\gamma_2 = P(X_{k+1} = -D \mid X_k = C, X_{k-1}, \ldots, X_1)$, there follows

$$\gamma_1 = P(X_{k+1} = -D,\ X_k = -D \mid X_{k-1}, \ldots)/V_k,$$

$$\gamma_2 = P(X_{k+1} = -D,\ X_k = C \mid X_{k-1}, \ldots)/(1 - V_k).$$

Thus,

$$\gamma_1 V_k = P(X_{k+1} = -D,\ X_k = -D \mid Y_k = 1)\,U_k + P(X_{k+1} = -D,\ X_k = -D \mid Y_k = 0)(1 - U_k)$$

$$= q_1^2 U_k + q_0[\alpha q_0 + (1 - \alpha)q_1](1 - U_k)$$

$$= (q_1 + \alpha q_0)(q_1 - q_0)U_k + \alpha q_0^2 + (1 - \alpha)q_0 q_1,$$

$$\gamma_2 (1 - V_k) = P(X_{k+1} = -D,\ X_k = C \mid Y_k = 1)\,U_k + P(X_{k+1} = -D,\ X_k = C \mid Y_k = 0)(1 - U_k)$$

$$= p_1 q_1 U_k + p_0[\alpha q_0 + (1 - \alpha)q_1](1 - U_k).$$

Finally $V_k = q_1 U_k + q_0(1 - U_k)$, and solving this for $U_k$ and substituting above yields the given expressions for $F_1$, $F_2$.

For $v \in [q_0, q_1]$, consider the functional equation for $H$

$$\text{(A)} \qquad H(v) = \max\{0,\ E[H(V_2) + f(V_2) \mid V_1 = v]\}$$

where $f(v) = -(C+D)v + C - \beta^*$. This equation may be derived in a fashion similar to that used by Bellman in dynamic programming problems. Heuristically, let $H(v)$ be the maximum payoff starting from $V_1 = v$. We have our choice of stopping and receiving zero or of making the transition to $V_2$, where we receive the amount $f(V_2)$ plus our maximum expected payoff starting from $V_2$, this latter being $H(V_2)$.
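The transition maps of the proposition in this section lend themselves to a quick numerical sanity check. The sketch below assumes the reading $F_1(v) = [v(q_1 + \alpha q_0) - \alpha q_0 q_1]/v$ and $F_2(v) = [v(\alpha p_0 - q_1) + q_1(1 - \alpha p_0)]/(1 - v)$, which is what the proof's derivation gives when $U_k$ is eliminated via $V_k = q_1 U_k + q_0(1 - U_k)$; the function names are hypothetical and $0 < q_0 < q_1 < 1$ is assumed so that no division by zero occurs.

```python
def F1(v, alpha, q0, q1):
    """Next defect probability after a defective item (assumed reading)."""
    return (v * (q1 + alpha * q0) - alpha * q0 * q1) / v


def F2(v, alpha, q0, q1):
    """Next defect probability after a non-defective item (assumed reading)."""
    p0 = 1.0 - q0
    return (v * (alpha * p0 - q1) + q1 * (1.0 - alpha * p0)) / (1.0 - v)


def track_v(items, alpha, q0, q1):
    """Return V_1, ..., V_{k+1} for k observed items (True = defective)."""
    v = q0                     # V_1 = q0: the machine starts in state 0
    path = [v]
    for defective in items:
        v = F1(v, alpha, q0, q1) if defective else F2(v, alpha, q0, q1)
        path.append(v)
    return path
```

With these maps the one-step mean `v * F1(v) + (1 - v) * F2(v)` works out to `alpha * v + (1 - alpha) * q1`, and both maps fix the point `q1`; a policy of threshold type would simply compare the last entry of `track_v(...)` with its threshold.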

However, the above heuristic does not establish any optimality properties, and while the connection between functional equations and optimal policies has been investigated (see [4], for example), none of the results seem appropriate for the present problem. Therefore, we must delve into the theory with the following theorem.

Theorem 1: If equation (A) has a bounded solution on $[q_0, q_1]$, let $s^*$ denote the policy: stop when $V_k \in \{v;\ H(v) = 0\}$. If $s^* \in S$, then $s^*$ is optimal in S (where here S denotes all policies with finite expected stopping time).

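Proof: Let $s$ be any policy in S with stopping variable $\hat{N}$ such that the set $\{\hat{N} = k\}$ depends only on the values of $X_k, \ldots, X_1$ and $E\hat{N} < \infty$. For any $\epsilon > 0$, we may take $n$ so large that

$$\Big| EJ - \sum_{k=1}^{n} \int_{\{\hat{N} \geq k\}} f(V_k)\, dP - \int_{\{\hat{N} \geq n\}} H(V_n)\, dP \Big| < \epsilon.$$

Thus,

$$EJ \leq \epsilon + \sum_{k=1}^{n-1} \int_{\{\hat{N} \geq k\}} f(V_k)\, dP + \int_{\{\hat{N} \geq n\}} [H(V_n) + f(V_n)]\, dP.$$

The set $\{\hat{N} \geq n\}$ depends only on $X_1, \ldots, X_{n-1}$, so we may replace the last integrand above by $E[H(V_n) + f(V_n) \mid V_{n-1}]$, and by (A) then

$$EJ \leq \epsilon + \sum_{k=1}^{n-1} \int_{\{\hat{N} \geq k\}} f(V_k)\, dP + \int_{\{\hat{N} \geq n\}} H(V_{n-1})\, dP.$$

But $\{\hat{N} \geq n\} \subset \{\hat{N} \geq n-1\}$ and $H(v) \geq 0$, so

$$EJ \leq \epsilon + \sum_{k=1}^{n-1} \int_{\{\hat{N} \geq k\}} f(V_k)\, dP + \int_{\{\hat{N} \geq n-1\}} H(V_{n-1})\, dP.$$

Continuing to proceed this way, and noting that $V_1 = q_0$, we get

$$EJ \leq \epsilon + H(q_0) + f(q_0).$$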

On the other hand, let $N^*$ be the stopping variable given by $s^*$, and $J^*$ the payoff. Then,

$$EJ^* \geq -\epsilon + \sum_{k=1}^{n-1} \int_{\{N^* \geq k\}} f(V_k)\, dP + \int_{\{N^* \geq n\}} E[H(V_n) + f(V_n) \mid V_{n-1}]\, dP;$$

by (A), on $\{N^* > n-1\}$, the last integrand is equal to $H(V_{n-1})$, yielding

$$EJ^* \geq -\epsilon + \sum_{k=1}^{n-1} \int_{\{N^* \geq k\}} f(V_k)\, dP + \int_{\{N^* > n-1\}} H(V_{n-1})\, dP.$$

Furthermore, on the set $\{N^* = n-1\}$, $H(V_{n-1}) = 0$, giving

$$EJ^* \geq -\epsilon + \sum_{k=1}^{n-1} \int_{\{N^* \geq k\}} f(V_k)\, dP + \int_{\{N^* \geq n-1\}} H(V_{n-1})\, dP.$$

Continuing, we conclude $EJ^* \geq -\epsilon + H(q_0) + f(q_0)$, which proves the theorem.

4. Solution of the Functional Equation

To investigate the solution of (A), we first prove:

Proposition 2: If $\theta(v)$ is monotonic nonincreasing on $[q_0, q_1]$, then so is $E(\theta(V_2) \mid V_1 = v)$.

Proof: Let $\theta_{v_0}(v)$ be defined as

$$\theta_{v_0}(v) = \begin{cases} 1, & q_0 \leq v \leq v_0 \\ 0, & v_0 < v \leq q_1. \end{cases}$$

Then,

$$E(\theta_{v_0}(V_2) \mid V_1 = v) = \theta_{v_0}(F_1(v))\,v + \theta_{v_0}(F_2(v))(1 - v).$$

It is easy to verify that $F_1(v)$, $F_2(v)$ are monotonically increasing in $v$ and that on $[q_0, q_1]$, $F_1(v) \geq F_2(v)$. Therefore,

$$E(\theta_{v_0}(V_2) \mid V_1 = v) = \begin{cases} 1, & F_1(v) \leq v_0 \\ 1 - v, & F_1(v) > v_0,\ F_2(v) \leq v_0 \\ 0, & F_2(v) > v_0 \end{cases}$$

and is decreasing. Since every nonincreasing function can be arbitrarily closely approximated by finite sums with positive coefficients of functions of the type $\theta_{v_0}$, the proposition follows.

To try and solve (A) we use an approximation procedure, defined by

$$H^{(n+1)}(v) = \max\{0,\ E[H^{(n)}(V_2) + f(V_2) \mid V_1 = v]\}$$

with $H^{(1)}(v) \equiv 0$.
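The approximation procedure can be carried out numerically on a grid. The sketch below is one possible discretization and not the authors' computation: it assumes that a value `beta` standing in for $\beta^*$ is supplied by the caller, that $0 < q_0 < q_1 < 1$, that the update maps are $F_1(v) = [v(q_1 + \alpha q_0) - \alpha q_0 q_1]/v$ and $F_2(v) = [v(\alpha p_0 - q_1) + q_1(1 - \alpha p_0)]/(1 - v)$, and that piecewise-linear interpolation between grid points is adequate. All names are hypothetical.

```python
def value_iteration(beta, alpha, q0, q1, C, D, grid_size=201, sweeps=500):
    """Iterate H <- max(0, E[H(V2) + f(V2) | V1 = v]) on a uniform grid.

    beta stands in for beta* and is assumed to be supplied by the caller.
    Returns (grid, H) after `sweeps` iterations starting from H = 0.
    """
    p0 = 1.0 - q0
    step = (q1 - q0) / (grid_size - 1)
    grid = [q0 + i * step for i in range(grid_size)]

    def f(v):
        return -(C + D) * v + C - beta

    def interp(H, v):
        # Piecewise-linear interpolation, clamped to [q0, q1].
        x = min(max((v - q0) / step, 0.0), grid_size - 1.0)
        i = min(int(x), grid_size - 2)
        w = x - i
        return (1.0 - w) * H[i] + w * H[i + 1]

    H = [0.0] * grid_size
    for _ in range(sweeps):
        new_H = []
        for v in grid:
            v_def = (v * (q1 + alpha * q0) - alpha * q0 * q1) / v  # F1(v)
            v_ok = (v * (alpha * p0 - q1)
                    + q1 * (1.0 - alpha * p0)) / (1.0 - v)          # F2(v)
            cont = (v * (interp(H, v_def) + f(v_def))
                    + (1.0 - v) * (interp(H, v_ok) + f(v_ok)))
            new_H.append(max(0.0, cont))
        H = new_H
    return grid, H


def stopping_threshold(grid, H, tol=1e-12):
    """Smallest grid point at which H vanishes, or None if H > 0 everywhere."""
    for v, h in zip(grid, H):
        if h <= tol:
            return v
    return None
```

Calling `stopping_threshold(*value_iteration(0.2, 0.9, 0.1, 0.6, 1.0, 2.0))` gives a grid approximation to the left endpoint of the set where the iterate vanishes, for that assumed value of `beta`.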

Proposition 3: The $H^{(n)}(v)$ are a nondecreasing sequence of continuous functions on $[q_0, q_1]$.

Proof: Assume that $H^{(n)}(v) \geq H^{(n-1)}(v)$. Then,

$$H^{(n+1)}(v) = \max\{0,\ E[H^{(n)}(V_2) + f(V_2) \mid V_1 = v]\}$$

$$\geq \max\{0,\ E[H^{(n-1)}(V_2) + f(V_2) \mid V_1 = v]\}$$

$$= H^{(n)}(v).$$

And since $H^{(2)}(v) \geq 0 = H^{(1)}(v)$, we have always $H^{(n+1)}(v) \geq H^{(n)}(v)$. Furthermore, if $H^{(n)}(v)$ is continuous, then since $E(\theta(V_2) \mid V_1 = v)$ is continuous if $\theta$ is continuous, the proposition holds.

Proposition 4: $H(v) = \lim_n H^{(n)}(v)$ is a bounded solution of (A).

Proof: Consider the function $av + b$, where

$$a = -\frac{\alpha}{1 - \alpha}(C + D)$$

and $b$ is taken so that $av + b \geq 0$, all $v \in [q_0, q_1]$. By a quick computation

$$E(V_2 \mid V_1 = v) = \alpha v + (1 - \alpha)q_1,$$

$$E[aV_2 + b + f(V_2) \mid V_1 = v] = (a - C - D)E(V_2 \mid V_1 = v) + b + C - \beta^*$$

$$= (a - C - D)[\alpha v + (1 - \alpha)q_1] + b + C - \beta^*$$

$$= av + b + C - (C+D)q_1 - \beta^*$$

$$\leq av + b.$$

This last by Proposition 1. Therefore, if $H^{(n)}(v) \leq av + b$, then

$$H^{(n+1)}(v) = \max\{0,\ E[H^{(n)}(V_2) + f(V_2) \mid V_1 = v]\} \leq \max\{0,\ av + b\} = av + b.$$

This establishes that $H(v)$ is bounded. That it is a solution is quite evident.
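The "quick computation" in the proof above can be verified numerically. The sketch below assumes the update maps $F_1(v) = [v(q_1 + \alpha q_0) - \alpha q_0 q_1]/v$, $F_2(v) = [v(\alpha p_0 - q_1) + q_1(1 - \alpha p_0)]/(1 - v)$ and the slope $a = -\alpha(C+D)/(1-\alpha)$; the function name and the parameter `beta` (standing in for $\beta^*$) are hypothetical.

```python
def affine_bound_gap(v, beta, alpha, q0, q1, C, D, b=0.0):
    """Return E[a*V2 + b + f(V2) | V1 = v] - (a*v + b) for the slope a below."""
    p0 = 1.0 - q0
    a = -alpha * (C + D) / (1.0 - alpha)
    v_def = (v * (q1 + alpha * q0) - alpha * q0 * q1) / v                # F1(v)
    v_ok = (v * (alpha * p0 - q1) + q1 * (1.0 - alpha * p0)) / (1.0 - v)  # F2(v)

    def g(u):
        # a*u + b + f(u), with f(u) = -(C + D)*u + C - beta
        return a * u + b - (C + D) * u + C - beta

    return v * g(v_def) + (1.0 - v) * g(v_ok) - (a * v + b)
```

For every `v` in `[q0, q1]` the returned gap should equal the constant `C - (C + D) * q1 - beta`, which is non-positive exactly when `beta >= C - (C + D) * q1`.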

At this point, we have all the material necessary for our main result.

Theorem 2: Either the policy: never stop and repair, yields a larger $ER_1/EN_1$ than any policy in S, or there is a number $\lambda^* < 1$ such that the policy: stop and repair when

$$P(Y_k = 1 \mid X_k, \ldots, X_1) \geq \lambda^*$$

is in S and is optimal in S.

Proof: Let $\underline{S}_n \subset [q_0, q_1]$ be defined by

$$\underline{S}_n = \{v;\ H^{(n)}(v) = 0\}.$$

By Propositions 2 and 3, $\underline{S}_n$ is of the form $[v_n, q_1]$, $v_n \leq q_1$, or empty. Since $F_1(q_1) = F_2(q_1) = q_1$, (A) gives $H(q_1) = \max[0,\ H(q_1) + g(q_1)]$ (writing $g(v) = E[f(V_2) \mid V_1 = v]$, so that $g(q_1) = f(q_1)$), and since $g(q_1) = C - (C+D)q_1 - \beta^* \leq 0$, $\underline{S}_n$ is non-empty unless possibly $\beta^* = C - (C+D)q_1$. Leaving this latter case for the nonce, $\underline{S} = \{v;\ H(v) = 0\} = \bigcap_n \underline{S}_n$ is thus a set of the form $[\gamma, q_1]$. By Theorem 1, if the policy: stop when $V_k \in [\gamma, q_1] = \underline{S}$, is in S, then it is optimum in S. Assume first that $\gamma < q_1$, and let $N^*$ be the stopping variable. Then

$$P(N^* > n) \leq P(V_n \notin [\gamma, q_1]).$$

To continue this inequality, we use

$$EV_n \leq q_1 P(V_n \in [\gamma, q_1]) + \gamma P(V_n \in [q_0, \gamma))$$

to get

$$P(V_n \notin [\gamma, q_1]) \leq \frac{q_1 - EV_n}{q_1 - \gamma}.$$

By their definitions,

$$EV_n = q_1 P(Y_n = 1) + q_0 P(Y_n = 0)$$

and since $P(Y_n = 0) = \alpha^{n-1}$, this gives

$$EV_n = q_1 - (q_1 - q_0)\alpha^{n-1}.$$

Substituting,

$$P(N^* > n) \leq \frac{q_1 - q_0}{q_1 - \gamma}\, \alpha^{n-1},$$

so $EN^* < \infty$. Now we show that if $\beta^* > C - (C+D)q_1$, then $\gamma < q_1$. For taking limits in (A) as $v$ goes up to $q_1$ yields

$$H(q_1-) = \max[0,\ H(q_1-) + g(q_1)]$$

and $g(q_1) < 0$ implies $H(q_1-) = 0$. Therefore, we can certainly find a neighborhood of $q_1$, say $[q_1 - \epsilon, q_1]$, $\epsilon > 0$, on which

$$H(F_1(v))\,v + H(F_2(v))(1 - v) + g(v) < 0,$$

and in this neighborhood, then, $H(v) = 0$.

Now for the case $\beta^* = C - (C+D)q_1$. In this case,

$$\sup ER_1/EN_1 = C - (C+D)q_1.$$

But this is exactly the payoff from the policy that never stops.

The theorem is stated in terms of the variables $P(Y_k = 1 \mid X_k, \ldots)$. These are related to the variables $V_k$ by

$$V_{k+1} = q_1 P(Y_{k+1} = 1 \mid X_k, \ldots) + q_0 P(Y_{k+1} = 0 \mid X_k, \ldots)$$

$$= \alpha(q_1 - q_0)P(Y_k = 1 \mid X_k, \ldots) + q_1(1 - \alpha) + q_0\alpha.$$

This transformation takes $q_1$ into 1 and $\gamma < q_1$ into some number $\lambda^* < 1$, concluding the proof of the theorem.
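The affine relation at the end of the proof, read as $V_{k+1} = \alpha(q_1 - q_0)\,\pi_k + q_1(1 - \alpha) + q_0\alpha$ with $\pi_k = P(Y_k = 1 \mid X_k, \ldots)$, can be inverted to pass between a threshold on $V$ and a threshold on the posterior probability. The helper names below are hypothetical and the block is only a small worked illustration of that change of variable.

```python
def posterior_to_v(pi, alpha, q0, q1):
    """V_{k+1} as an affine function of pi = P(Y_k = 1 | X_k, ..., X_1)."""
    return alpha * (q1 - q0) * pi + q1 * (1.0 - alpha) + q0 * alpha


def v_to_posterior(v, alpha, q0, q1):
    """Inverse map: the posterior probability that corresponds to V = v."""
    return (v - q1 * (1.0 - alpha) - q0 * alpha) / (alpha * (q1 - q0))
```

Under this reading, `v_to_posterior(q1, ...)` is 1, and a threshold `gamma < q1` on `V` is carried to a number strictly below 1.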
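5. The Character of the Optimal Policy

We first give a more explicit form of the optimal policy by evaluating $P(Y_k = 1 \mid X_k, X_{k-1}, \ldots)$. Note that

$$P(Y_k, Y_{k-1}, \ldots, Y_1, X_k, \ldots, X_1) = P(X_k, \ldots, X_1 \mid Y_k, \ldots, Y_1)\,P(Y_k, \ldots, Y_1) = \prod_{r=1}^{k} P(X_r \mid Y_r)\; P(Y_k, \ldots, Y_1).$$

Define $Q_j$ by

$$Q_j = P(Y_k = Y_{k-1} = \cdots = Y_{j+1} = 1,\ Y_j = \cdots = Y_1 = 0), \qquad j = 1, \ldots, k,$$

so that

$$Q_j = \begin{cases} (1 - \alpha)\alpha^{j-1}, & j < k \\ \alpha^{k-1}, & j = k. \end{cases}$$

Then

$$P(Y_k = 1, X_k, \ldots, X_1) = \sum_{j=1}^{k-1} \prod_{r=1}^{j} P(X_r \mid Y_r = 0) \prod_{r=j+1}^{k} P(X_r \mid Y_r = 1)\; Q_j.$$

Let $N_j$ = no. of defectives in the first $j$ trials, so

$$P(Y_k = 1, X_k, \ldots, X_1) = \sum_{j=1}^{k-1} q_0^{N_j}\, p_0^{\,j - N_j}\, q_1^{N_k - N_j}\, p_1^{\,k - j - (N_k - N_j)}\, (1 - \alpha)\alpha^{j-1}.$$

Denoting $z = q_0 p_1/p_0 q_1$, $w = p_0 \alpha/p_1$,

$$P(Y_k = 1, X_k, \ldots, X_1) = p_1^k \Big(\frac{q_1}{p_1}\Big)^{N_k} \frac{1 - \alpha}{\alpha} \sum_{j=1}^{k-1} z^{N_j} w^j.$$

Similarly

$$P(X_k, \ldots, X_1) = p_1^k \Big(\frac{q_1}{p_1}\Big)^{N_k} \frac{1 - \alpha}{\alpha} \Big[ \sum_{j=1}^{k-1} z^{N_j} w^j + \frac{1}{1 - \alpha}\, z^{N_k} w^k \Big].$$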
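The optimal policy becomes: stop when

$$\frac{\sum_{j=1}^{k-1} z^{N_j} w^j}{\sum_{j=1}^{k-1} z^{N_j} w^j + \frac{1}{1 - \alpha}\, z^{N_k} w^k} \geq \lambda^*,$$

or

$$\sum_{j=1}^{k-1} z^{N_j} w^j \geq \lambda^{**} z^{N_k} w^k$$

where $\lambda^{**} = \lambda^*/[(1 - \alpha)(1 - \lambda^*)]$. Or, if $J_j$ is the no. of defectives in the last $j$ trials, stop when

$$\sum_{j=1}^{k-1} z^{-J_j} w^{-j} \geq \lambda^{**}.$$

Or, if $I_j$ is the no. of non-defectives in the last $j$ trials, stop when

$$\sum_{j=1}^{k-1} \Big(\frac{q_1}{q_0}\Big)^{J_j} \Big(\frac{p_1}{p_0}\Big)^{I_j} \alpha^{-j} \geq \lambda^{**}.$$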

While this above expression may or may not be interesting, a more illuminating form of the optimal policy was suggested by Roy Radner. This is: stop when

$$\frac{P(Y_k = 1 \mid X_k, \ldots, X_1)}{P(Y_k = 0 \mid X_k, \ldots, X_1)} \geq \lambda^{***},$$

where $\lambda^{***} = \lambda^*/(1 - \lambda^*)$. The expression on the left is a likelihood ratio, and the policy may be stated as: at every step, test the hypothesis that the machine is in state 1 vs. state 0, given all the relevant information. When the hypothesis can be accepted at a certain level, stop and repair.
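The conditional probability that drives the policy can be computed in two ways, and comparing them is a useful check on the algebra. The sketch below computes $P(Y_k = 1 \mid X_k, \ldots, X_1)$ once by a forward predict-and-update recursion and once from the sums $\sum_j z^{N_j} w^j$ with $z = q_0 p_1/(p_0 q_1)$, $w = p_0 \alpha/p_1$. The recursion is a standard filtering device brought in here for comparison, not a method taken from the text, and the function names are hypothetical; $0 < q_0 < q_1 < 1$ and $0 < \alpha < 1$ are assumed.

```python
def posterior_by_filter(items, alpha, q0, q1):
    """P(Y_k = 1 | X_k, ..., X_1) by a forward (predict/update) recursion."""
    pi = 0.0                        # P(Y_1 = 1) = 0: the machine starts in state 0
    for k, defective in enumerate(items):
        if k > 0:
            pi = pi + (1.0 - pi) * (1.0 - alpha)   # one machine transition
        like1 = q1 if defective else 1.0 - q1
        like0 = q0 if defective else 1.0 - q0
        pi = pi * like1 / (pi * like1 + (1.0 - pi) * like0)
    return pi


def posterior_by_sum(items, alpha, q0, q1):
    """Same posterior from the sums over z**N_j * w**j (assumed reading)."""
    p0, p1 = 1.0 - q0, 1.0 - q1
    z = q0 * p1 / (p0 * q1)
    w = p0 * alpha / p1
    k = len(items)
    n_def = 0                       # running count N_j of defectives
    s = 0.0
    for j in range(1, k + 1):
        n_def += 1 if items[j - 1] else 0
        if j < k:
            s += z ** n_def * w ** j
    last = z ** n_def * w ** k / (1.0 - alpha)     # term for 'still in state 0'
    return s / (s + last)
```

Here `items` is a list of booleans (True = defective). A threshold policy would stop as soon as either function returns a value at or above the chosen level; the ratio `pi / (1 - pi)` is the quantity appearing in the likelihood-ratio form.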
The parameter $\lambda^*$ seems difficult to compute, although some approximation methods are useful. As to the shortcomings of the model, they are more or less apparent, and it is our hope that more realistic models will follow.